Species Distribution Modeling

Addax (Addax nasomaculatus) are considered one of the rarest antelopes on earth, with an estimated population of <100 individuals in the wild. We assessed the distribution and occurrence of this species from field surveys collected by the Sahara Conservation Fund over a 7-year period (2008-2014). Our results provide insight into the factors contributing to species occurrence and are guiding field surveys to areas that have the potential to support small and geographically isolated populations of addax. We incorporated field-derived variables of vegetation cover with remote sensing measures of vegetation productivity (NDVI - Normalized Difference Vegetation Index) and surface roughness (derived from 30-m SRTM). Models were fit in a generalized linear regression framework to evaluate and predict species occurrence.

Addax moving across shifting sands in the Tin Toumma desert, Niger (Photo: T.Rabeil, SCF)

Lab Exercise

In this exercise, we will follow the steps detailed in:

Stabach, J.A., T. Rabeil, V. Turmine, T. Wacher, T. Mueller, and P. Leimgruber. 2017. On the brink of extinction - Habitat selection of addax and dorcas gazelle across the Tin Toumma desert, Niger. Diversity and Distributions 23:581-591.

We will use R for all analyses to create a species distribution model, highlighting the variables important in predicting addax occurrence. As in other exercises in R, we will first load the packages needed to perform the analyses. Use help() for any functions that you don’t understand or that require additional information.

Load Packages

library(usdm)
library(arm)
library(visreg)
library(pROC)
library(DAAG)
library(fields)
library(MuMIn)
library(gstat)
library(sp)
library(raster)
library(lubridate)
library(DT)
library(leaflet)

Data File

The raster data layers (NDVI and surface roughness) have already been extracted at plot locations (see SurfaceRoughness). This has been done to focus on the statistical modeling, rather than on the details of creating the spatial dataset. Building the spatial database, however, is hugely important and should not be overlooked. The layers included in analyses will ultimately depend on your scientific research questions and what is biologically relevant for your species.

NDVI and surface roughness were summarized by calculating the mean value of all pixels within a 2.5-km radius of each plot location (a moving window analysis). Plot locations were spaced at 5-km intervals along line transects. Each line transect varied in length between 50 km and 100 km, and transects were spaced ~10 km apart. Transects were sampled repeatedly across years.

All data (including animal counts) were summarized at plot locations. We re-coded animal counts within a 2.5-km radius to a measure of occurrence (i.e., presence/absence). Thus, we modeled the data as a series of 1’s and 0’s, representing addax occurrence at plot locations. Data were aggregated in this fashion because of variability between surveys (i.e., the transect locations didn’t overlap exactly) and because we did not have confidence in the accuracy of the number of individuals recorded at each sighting. In addition, distances to animal sightings were only recorded in a subset of the surveys. Sightings >500-m from the transect were removed due to an assumed undercounting bias (confirmed by investigating the frequency of sightings in relation to distance). This allowed for a conservative broad-scale approach to incorporate extremely messy field data collected over multiple years. See more details in Stabach et al. 2017.
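The count-to-occurrence re-coding described above amounts to a simple threshold on the counts. A minimal sketch with made-up counts (not values from the dataset):

```r
# Re-code animal counts at plot locations to presence/absence (synthetic counts)
counts <- c(0, 5, 0, 16, 2)
presence <- as.integer(counts > 0)
presence  # 0 1 0 1 1
```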

# Load SpatialPointsDataFrame.  Set working directory if necessary.
load(file="Addax_Dataset2.RData")
# Look at data
head(Sub.All2)
##   Pt_ID       Date YearMonth Id_1        X       Y Addax Dorcas Cornul
## 1     1 2008-12-06    200812   23 767911.0 1742830     0      0      0
## 2     2 2008-12-06    200812   23 770769.0 1744528     0      0      0
## 3     3 2008-12-06    200812   23 776095.8 1747469     0      5      0
## 4     4 2008-12-06    200812   23 780842.3 1750128     0      4      0
## 5     5 2008-12-06    200812   23 785517.3 1752312     0     16      0
## 6     6 2008-12-06    200812   23 790001.9 1754484     0      6      0
##   Stipa1 Stipa2 Human      TRI           TPI    ROUGH Index      ndvi
## 1      0      0     0 2.692203  1.680855e-03 8.343433     1 0.1223870
## 2      0      0     0 2.483226 -9.582118e-05 7.700197     1 0.1219408
## 3      1      0     0 1.617903 -5.089201e-04 5.044250     1 0.1201930
## 4      1      0     0 1.708651  2.159401e-03 5.331611     1 0.1287899
## 5      0      0     0 2.000507  4.656293e-03 6.293324     1 0.1247775
## 6      1      0     0 2.029911  4.920087e-04 6.445662     1 0.1234554
##   Month Season      X.1     Y.1 obsAddax obsDorc Year
## 1    12   Cold 767911.0 1742830        0       0 2008
## 2    12   Cold 770769.0 1744528        0       0 2008
## 3    12   Cold 776095.8 1747469        0       1 2008
## 4    12   Cold 780842.3 1750128        0       1 2008
## 5    12   Cold 785517.3 1752312        0       1 2008
## 6    12   Cold 790001.9 1754484        0       1 2008
# Or get fancy with how the data table is displayed
#datatable(Sub.All2, rownames = FALSE, filter="top", options = list(pageLength = 5, scrollX = T))

Columns in the database include Date of survey, X and Y location of each plot, the number of addax and dorcas gazelle (a sympatric species) sighted at each location, a unique plot ID (Pt_ID), and the presence/absence of vegetation species Cornulaca monacantha (Cornul), Stipagrostis acutiflora (Stipa1), and Stipagrostis vulnerans (Stipa2). These vegetation species were thought a priori to influence addax occurrence and were recorded at each plot location. Human disturbance (e.g., footprint, sighting, tire tracks) was also recorded (i.e., Human = 1).

Other variables, including surface roughness (ROUGH) and the Normalized Difference Vegetation Index (NDVI), were included as data layers in our analysis. Surface roughness is defined as the change in local elevation range (i.e., the difference between the minimum and maximum values of a cell and its eight surrounding neighbors). NDVI is known to be strongly correlated with a region’s vegetation productivity/greenness and has been used extensively as an important parameter in models predicting animal movement and habitat use. NDVI data (MOD13Q1) were downloaded as 16-day cloud-free data composites with a 250-m resolution.
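For reference, NDVI is computed from red and near-infrared reflectance. The standard formula, shown here with illustrative reflectance values (not values from the dataset), is:

```r
# NDVI = (NIR - Red) / (NIR + Red); near 0 for bare sand,
# approaching 1 for dense green vegetation
nir <- 0.45  # illustrative near-infrared reflectance
red <- 0.10  # illustrative red reflectance
ndvi.example <- (nir - red) / (nir + red)
round(ndvi.example, 2)  # 0.64
```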

Summarize Dataset

If we visualize the dataset, we see that data were collected multiple times a year and that the occurrence of addax (and dorcas gazelle) varies across years and seasons.

##    YearMonth Year Season PresAddax PresDorc PrevAdd PrevDorc  One Both
## 1     200812 2008   Cold        22       28    33.3     42.4 68.2  7.6
## 2     200906 2009    Dry         6       16     9.1     24.2 28.8  4.5
## 3     200909 2009    Wet         8       19    12.1     28.8 39.4  1.5
## 4     200912 2009   Cold        16       33    24.2       50 63.6 10.6
## 5     201004 2010    Dry         4       15     6.1     22.7 24.2  4.5
## 6     201006 2010    Dry         8       19    12.1     28.8 37.9    3
## 7     201009 2010    Wet         3       13     4.5     19.7 24.2    0
## 8     201012 2010   Cold        14       29    15.9       33 44.3  4.5
## 9     201103 2011    Dry         7        4    15.9      9.1 15.9  9.1
## 10    201107 2011    Wet        10       12    15.2     18.2 24.2  9.1
## 11    201212 2012   Cold         8        9    12.1     13.6 21.2  4.5
## 12    201311 2013   Cold        16       35    24.2       53 60.6 16.7
## 13    201403 2014    Dry         4        9     6.1     13.6 16.7    3
## 14    201409 2014    Wet         4        8     6.1     12.1 18.2    0
## 15    201412 2014   Cold         1       34     1.5     51.5 51.5  1.5

Scaling Parameter Values

It is often helpful and necessary to scale continuous parameters that have vastly different value ranges (e.g., elevation and NDVI). Doing so can help with model convergence. While conclusions will remain the same, it is important to remember that data are not on the same scale as the original values and must be back-transformed when making raster predictions. This is critical.
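To make the scaling and back-transformation concrete, here is a minimal round-trip sketch with synthetic values; the same logic applies when back-transforming the raster surfaces:

```r
# scale() with center=TRUE, scale=TRUE computes (x - mean(x)) / sd(x)
x <- c(2, 4, 6, 8)
x.scaled <- as.numeric(scale(x, center = TRUE, scale = TRUE))

# Back-transform by multiplying by the original SD and adding the original mean
x.back <- x.scaled * sd(x) + mean(x)
all.equal(x.back, x)  # TRUE
```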

Questions:

  1. How would you find out the details of the scale function?
  2. What is the function doing?
  3. Can you write out the function when scale and center are TRUE?
# Scale the parameters...but let's create a copy of the dataset first
Sub.Scale <- Sub.All2

# Use the help(scale) for more information on the function
Sub.Scale$shuman <- as.numeric(scale(Sub.Scale[,"Human"],center=TRUE))
Sub.Scale$sndvi <- as.numeric(scale(Sub.Scale[,"ndvi"],center=TRUE))
Sub.Scale$srough <- as.numeric(scale(Sub.Scale[,"ROUGH"],center=TRUE))
Sub.Scale$sDorcas <- as.numeric(scale(Sub.Scale[,"Dorcas"],center=TRUE))
Sub.Scale$sAddax <- as.numeric(scale(Sub.Scale[,"Addax"],center=TRUE))

Saving Parameter Values

Now that the parameter values have been scaled, we will use these values for modeling purposes. But, when we make the prediction, we will need to back-transform our raster surfaces. Otherwise, the coefficients from our models can’t be applied to the raster grids. To do so, we need the mean and standard deviation of each column. We will then use these values to scale and center each of the raster surfaces.

In this initial step, we create a blank matrix to hold the values. Later in the code, we will refer to the Scale.Val object.

# Create a blank matrix to hold the mean and standard deviation of each continuous variable included in the dataset.
# These values are essential for rescaling to the original values
Scale.Val <- matrix(NA, 2, ncol(Sub.Scale[,c(15,17,8,12,7)]), dimnames=list(c("Mn","Sd"),c("Rough","NDVI","Dorcas","Human","Addax")))

# Calculate the mean and standard deviations.
Scale.Val[1,] <- apply(Sub.Scale[,c(15,17,8,12,7)],2,mean)
Scale.Val[2,] <- apply(Sub.Scale[,c(15,17,8,12,7)],2,sd)

# Look at values
Scale.Val
##       Rough        NDVI    Dorcas     Human    Addax
## Mn 6.618164 0.092855046  2.922222 0.7393939 1.476768
## Sd 1.464744 0.009629818 10.710220 1.8006750 8.336148
# Create a second object which summarizes the data.  Here, I'm just specifying the column numbers.
data.scale <- scale(Sub.Scale[,c(15,17,8,12,7)],center=TRUE)
(summary.data <- apply(data.scale,2,summary))
##                 ROUGH          ndvi        Dorcas         Human
## Min.    -2.822142e+00 -2.702640e+00 -2.728443e-01 -4.106204e-01
## 1st Qu. -6.399369e-01 -6.891035e-01 -2.728443e-01 -4.106204e-01
## Median  -2.043973e-01 -8.768665e-02 -2.728443e-01 -4.106204e-01
## Mean     9.123348e-17  1.666452e-17  2.937262e-17  4.479617e-18
## 3rd Qu.  6.266297e-01  5.734426e-01 -1.794755e-01  1.447269e-01
## Max.     2.953631e+00  4.600067e+00  1.242531e+01  9.030284e+00
##                 Addax
## Min.    -1.771523e-01
## 1st Qu. -1.771523e-01
## Median  -1.771523e-01
## Mean     7.929132e-18
## 3rd Qu. -1.771523e-01
## Max.     1.973612e+01

Correlation Analysis

Like other analyses, we need to evaluate data redundancy. Are any of the continuous variables included in our analysis highly correlated?

Questions:

  1. What do we do with categorical variables?
  2. What is a reasonable correlation threshold?
# Group variables together in a dataframe
data.all <- as.data.frame(cbind(Sub.Scale$srough,Sub.Scale$sndvi,Sub.Scale$sDorcas,Sub.Scale$shuman,Sub.Scale$sAddax))

# Variance Inflation Analysis
(eval.vif <- vifstep(data.all))
## No variable from the 5 input variables has collinearity problem. 
## 
## The linear correlation coefficients ranges between: 
## min correlation ( V5 ~ V3 ):  0.002850639 
## max correlation ( V2 ~ V1 ):  -0.3029674 
## 
## ---------- VIFs of the remained variables -------- 
##   Variables      VIF
## 1        V1 1.173900
## 2        V2 1.126290
## 3        V3 1.044492
## 4        V4 1.056243
## 5        V5 1.005890
# See also cor(data.all)
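As a quick illustration of what vifstep and cor() are checking, here is a base-R sketch with synthetic variables (|r| > 0.7 is a commonly used cutoff, though thresholds vary by author):

```r
set.seed(42)
d <- data.frame(a = rnorm(100))
d$b <- 0.9 * d$a + rnorm(100, sd = 0.2)  # strongly correlated with a
d$c <- rnorm(100)                        # independent of both
cor.mat <- cor(d)

# Flag variable pairs exceeding the cutoff
which(abs(cor.mat) > 0.7 & upper.tri(cor.mat), arr.ind = TRUE)
```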

Data Model

Generalized Linear Regression (GLM)

Model the occurrence of addax in a Generalized Linear Regression (GLM) framework. Our goal here was not necessarily to create the very best model. Instead, we aimed to:

  1. Assess how each variable contributes to predicting addax occurrence
  2. Evaluate a submodel that contains only the remote sensing layers to make a prediction of habitat suitability across the landscape

We can easily assess how this ‘submodel’ compares with our ‘full’ model or the ‘best’ model.

Question:

  1. Why is using a GLM advantageous given the data and objectives?
  2. When fitting multiple models, how should we evaluate which of the models is the best (Hint: see the dredge function in library(MuMIn))?
# Create a full model with all the variables you think are important predictors of addax occurrence
glm.Addax <- glm(obsAddax ~ srough + I(srough^2) + sndvi + I(sndvi^2) + Human + obsDorc + Stipa1 + Stipa2 + Cornul + Season + Year, data = Sub.Scale, family = binomial(link="logit"))

# dredge() requires a fitted global model and na.fail, so run it after fitting
options(na.action=na.fail)
m1 <- dredge(glm.Addax)
head(m1)

# Summarize result
summary(glm.Addax)
## 
## Call:
## glm(formula = obsAddax ~ srough + I(srough^2) + sndvi + I(sndvi^2) + 
##     Human + obsDorc + Stipa1 + Stipa2 + Cornul + Season + Year, 
##     family = binomial(link = "logit"), data = Sub.Scale)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6707  -0.5032  -0.3257  -0.1610   3.1998  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  0.04781    0.43617   0.110 0.912715    
## srough       0.92886    0.18762   4.951 7.39e-07 ***
## I(srough^2) -0.67000    0.13669  -4.902 9.50e-07 ***
## sndvi       -0.14342    0.18245  -0.786 0.431830    
## I(sndvi^2)  -0.28571    0.10769  -2.653 0.007976 ** 
## Human       -0.26027    0.11153  -2.334 0.019610 *  
## obsDorc1     0.85616    0.25931   3.302 0.000961 ***
## Stipa11     -0.30643    0.25512  -1.201 0.229697    
## Stipa21      1.04595    0.26836   3.898 9.71e-05 ***
## Cornul1      0.56465    0.27663   2.041 0.041238 *  
## SeasonDry   -0.17105    0.37617  -0.455 0.649327    
## SeasonWet   -0.24657    0.34744  -0.710 0.477898    
## Year2009    -1.53933    0.51044  -3.016 0.002564 ** 
## Year2010    -2.14737    0.52032  -4.127 3.67e-05 ***
## Year2011    -2.14421    0.62789  -3.415 0.000638 ***
## Year2012    -2.67330    0.62883  -4.251 2.13e-05 ***
## Year2013    -1.34818    0.50000  -2.696 0.007011 ** 
## Year2014    -3.95420    0.59485  -6.647 2.98e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 773.74  on 989  degrees of freedom
## Residual deviance: 606.17  on 972  degrees of freedom
## AIC: 642.17
## 
## Number of Fisher Scoring iterations: 6
# Or print the confidence intervals
#confint(glm.Addax)

Graph Results

Use the visreg and coefplot functions to easily graph the results from a glm.

# Graph result for surface roughness
visreg(glm.Addax,"srough",scale="response",ylab="Prob",partial=TRUE,line=list(col="blue"),fill=list(col="gray"),ylim=c(0,1))

# Plot the coefficients
coefplot(glm.Addax, plot=TRUE, mar=c(1,4,5.1,2), intercept=FALSE, vertical=TRUE, main="", var.las=1, frame.plot=FALSE)

Question:

  1. How could we plot all the variables included in the model or change which variable was output?

Making the Plot Look Nicer

Can we improve the look of these plots and plot the x-axis values with the original values?

Here again, look at help(visreg) to gather information on the items that can be changed/updated.

# Graph the NDVI response
# ************************
# Since we are graphing NDVI, we need the 2nd column of the minimum and maximum values calculated above in the summary.data table
MinVal <- summary.data[1,2]
MaxVal <- summary.data[6,2]

# Then, set the sequence in which to plot the values
divs <- 100
x <- seq(MinVal, MaxVal, length.out=divs)
x.unscale <- x*Scale.Val[2,2]+Scale.Val[1,2] # Keep track of the correct row and column: multiply by the SD and add the mean

# You can then substitute the unscaled values into the plot by relabeling the axis with them.
visreg(glm.Addax,"sndvi",scale="response",ylab="Probability of Occurrence",xlab="NDVI",partial=TRUE, axes=FALSE, rug=0, ylim=c(0,1),line=list(col="black",lwd=2,lty=1),fill=list(col="grey"),points=list(col="black",cex=0.25,pch=19),frame=FALSE,main="Addax")

# Plot axes
axis(2,col="black",col.axis="black")
# Most complicated part: substitute the back-transformed values for the scaled axis labels.
axis(1,at=c(x[1],x[50],x[100]),lab=c(round(x.unscale[1],digits=2),round(x.unscale[50],digits=2),round(x.unscale[100],digits=2)),col="black",col.axis="black")

# That's nicer!

Question:

  1. How would you plot the response to surface roughness?
  2. This code for plotting surface roughness is nearly identical to the code for plotting NDVI. Could you automate this and programmatically plot both graphs?

Validation

How do we determine if our models are any good? We can compare our full model to a null model and look at the Area Under the Curve (AUC) statistic. AUC summarizes the trade-off between the true positive rate (sensitivity) and the false positive rate (1 - specificity). Some guidelines for interpreting AUC:

  • 0.9 - 1: Excellent (A)
  • 0.8 - 0.9: Good (B)
  • 0.7 - 0.8: Fair (C)
  • 0.6 - 0.7: Poor (D)
  • 0.5 - 0.6: Fail (F)
# What's the AUC?
predpr <- predict(glm.Addax, type=c("response"))
(roccurve <- roc(Sub.Scale$obsAddax ~ predpr))
## 
## Call:
## roc.formula(formula = Sub.Scale$obsAddax ~ predpr)
## 
## Data: predpr in 859 controls (Sub.Scale$obsAddax 0) < 131 cases (Sub.Scale$obsAddax 1).
## Area under the curve: 0.8221
plot(roccurve)
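AUC also has a useful rank-based interpretation: it is the probability that a randomly chosen presence plot receives a higher predicted value than a randomly chosen absence plot. A small sketch with made-up predictions (not output from the fitted model):

```r
# Predicted probabilities at presence and absence plots (made-up values)
pred.pres <- c(0.80, 0.60, 0.70)
pred.abs  <- c(0.20, 0.50, 0.65)

# Compare every presence/absence pair; ties count half
pairs <- expand.grid(p = pred.pres, a = pred.abs)
auc <- mean(pairs$p > pairs$a) + 0.5 * mean(pairs$p == pairs$a)
round(auc, 2)  # 0.89
```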

Raster Prediction

One of the most valuable parts of a species distribution model is predicting to locations where surveys were not performed. In order to make a prediction at these locations, we need data that have wall-to-wall coverage. Unfortunately, only two of the layers incorporated in the full model have full coverage (NDVI and surface roughness).

Create a model with these two layers, assess how it compares with the full model, and predict across the entire study area. As you will see, model statistics indicate that this submodel is not as good as the full model (compare the AIC and AUC). The submodel is still useful for guiding field teams, however, as long as we are clear about its shortcomings (i.e., we’d expect reduced predictive power because we are not including the fine-scale data collected at individual plot locations).

Create Model Subset and Evaluate

glm.Addax2 <- glm(obsAddax ~ srough + I(srough^2) + sndvi + I(sndvi^2), data = Sub.Scale, family = binomial(link="logit"))

# Summarize and print confidence intervals
summary(glm.Addax2)
## 
## Call:
## glm(formula = obsAddax ~ srough + I(srough^2) + sndvi + I(sndvi^2), 
##     family = binomial(link = "logit"), data = Sub.Scale)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.7518  -0.6128  -0.4999  -0.3055   2.9167  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.31025    0.13616  -9.623  < 2e-16 ***
## srough       0.63602    0.15559   4.088 4.35e-05 ***
## I(srough^2) -0.55024    0.12642  -4.352 1.35e-05 ***
## sndvi        0.08772    0.12001   0.731   0.4648    
## I(sndvi^2)  -0.21425    0.08969  -2.389   0.0169 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 773.74  on 989  degrees of freedom
## Residual deviance: 734.51  on 985  degrees of freedom
## AIC: 744.51
## 
## Number of Fisher Scoring iterations: 6
# AUC
predpr <- predict(glm.Addax2, type=c("response"))
(roccurve <- roc(Sub.Scale$obsAddax ~ predpr))
## 
## Call:
## roc.formula(formula = Sub.Scale$obsAddax ~ predpr)
## 
## Data: predpr in 859 controls (Sub.Scale$obsAddax 0) < 131 cases (Sub.Scale$obsAddax 1).
## Area under the curve: 0.6573
plot(roccurve)

Load Raster Layers

Load the NDVI and SRTM data to make a model prediction.

Remember that we will still need to scale the raster layers, since our model coefficients are based on scaled data. We will use an NDVI image from November 2007 because we also have an ancillary dataset (i.e., a flight survey) that we used to independently assess model performance (data not shown).

NDVI

# Load NDVI data from flight survey date
# Note that this file has 250-meter resolution (MOD13Q1 data product)
# The SRTM data has 30-m resolution
# We need these data sources to have the same resolution in order to make the prediction. 

# Use the 2007 data from November for validation...matches the flight survey
ndvi <- raster("Data/MOD13Q1_Nov2017.tif")
# Convert raster values to actual NDVI values (apply the MOD13Q1 scale factor)
ndvi <- ndvi*0.0001

# Data needs to be summarized at 2.5km and scaled to match
# Create a focal weights matrix to match the 2.5-km summary done at the survey points
# This is just a weighted circular grid used to summarize values within the window
FoGrid <- focalWeight(ndvi,d=2500,type='circle')

# Now Summarize the NDVI within the focalWeight grid
ndvi2 <- focal(x=ndvi,w=FoGrid,fun=sum,na.rm=TRUE) # Use sum: the focalWeight matrix sums to 1, so the weighted sum is the mean within the focal window

# Plot result
plot(ndvi2)

Surface Roughness

Now repeat the procedure for the surface roughness layer. Rough is a 30-m resolution file, so it will take longer to process.

# Create a different focalWeight grid because the cell resolutions are different (30 meters instead of 250 meters)
rough <- raster("Data/Rough_Sub.tif")

FoGrid1 <- focalWeight(rough,d=2500,type='circle')
rough2 <- focal(x=rough,w=FoGrid1,fun=sum,na.rm=TRUE)

# Plot layer
plot(rough2)

Scale Rasters and Resample

Scale the rasters using the value summaries (mean and standard deviation) created above. Then, resample so the NDVI and SRTM-derived layers share the same resolution and extent.

Question:

  1. What interpolation method should you use when resampling the raster dataset?
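For continuous layers like NDVI and roughness, bilinear interpolation (a weighted average of the four nearest cell centers) is a natural choice; nearest-neighbor would be preferred for categorical data. A base-R sketch of the bilinear calculation, with illustrative values:

```r
# Corner values of a 2x2 cell neighborhood (illustrative continuous values)
z <- matrix(c(0.1, 0.3,
              0.2, 0.4), nrow = 2, byrow = TRUE)
fx <- 0.5; fy <- 0.5  # fractional position within the cell (here, the center)

top    <- z[1, 1] * (1 - fx) + z[1, 2] * fx
bottom <- z[2, 1] * (1 - fx) + z[2, 2] * fx
top * (1 - fy) + bottom * fy  # 0.25 at the center
```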
# Scale values: (x - mean(x)) / sd(x)
Scale.Val
##       Rough        NDVI    Dorcas     Human    Addax
## Mn 6.618164 0.092855046  2.922222 0.7393939 1.476768
## Sd 1.464744 0.009629818 10.710220 1.8006750 8.336148
srough <- (rough2-Scale.Val[1,1])/Scale.Val[2,1]
sndvi <- (ndvi2-Scale.Val[1,2])/Scale.Val[2,2]

# Resample the grids so that they can be added together in the model.  
# This may take a long time if interpolating from 250 to 30 m
# Much quicker if going to 30 to 250 m
# Why would you want to do one resolution over the other?
#sndvi <- resample(sndvi,srough,method="bilinear") # 250 - 30 m (Slow)
srough <- resample(srough, sndvi, method="bilinear") # 30 - 250 m (Fast)

# Compare the resolutions 
compareRaster(srough,sndvi)
## [1] TRUE

Make Final Prediction

Now that the rasters have been scaled, we can make a final prediction. This can be accomplished by using the coefficients from the model and manually inputting them into the equation. Alternatively, we can use the predict command. Using the predict command requires that we create a rasterBrick with the same variable names as included in our model.
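Either route applies the inverse-logit link to the linear predictor. A standalone sketch (coefficient values rounded from the submodel summary, used here purely for illustration):

```r
# Inverse logit: what predict(..., type = "response") does for a binomial GLM
b0 <- -1.31; b1 <- 0.64; b2 <- -0.55  # intercept, srough, srough^2 (illustrative)
x <- 0.5                              # a scaled roughness value
eta <- b0 + b1 * x + b2 * x^2         # linear predictor
p <- exp(eta) / (1 + exp(eta))        # probability of occurrence

# Matches the built-in inverse logit
all.equal(p, plogis(eta))  # TRUE
```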

# Summarize the model
summary(glm.Addax2)
## 
## Call:
## glm(formula = obsAddax ~ srough + I(srough^2) + sndvi + I(sndvi^2), 
##     family = binomial(link = "logit"), data = Sub.Scale)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.7518  -0.6128  -0.4999  -0.3055   2.9167  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.31025    0.13616  -9.623  < 2e-16 ***
## srough       0.63602    0.15559   4.088 4.35e-05 ***
## I(srough^2) -0.55024    0.12642  -4.352 1.35e-05 ***
## sndvi        0.08772    0.12001   0.731   0.4648    
## I(sndvi^2)  -0.21425    0.08969  -2.389   0.0169 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 773.74  on 989  degrees of freedom
## Residual deviance: 734.51  on 985  degrees of freedom
## AIC: 744.51
## 
## Number of Fisher Scoring iterations: 6
# Manually:
# We could calculate the prediction directly from the model coefficients:
##coef <- summary(glm.Addax2)
##coef <- coef$coefficients

##Addax.predict <- exp(coef[1] + srough*coef[2] + srough^2*coef[3] + sndvi*coef[4] + sndvi^2*coef[5])/(1 + exp(coef[1] + srough*coef[2] + srough^2*coef[3] + sndvi*coef[4] + sndvi^2*coef[5]))

# Using Predict:
# Create quadratic raster layers to include in the rasterBrick
srough2 <- srough^2
sndvi2 <- sndvi^2

# Add to brick and rename layer names
satImage <- brick(srough, srough2, sndvi, sndvi2)
names(satImage) <- c("srough","srough2", "sndvi", "sndvi2")

# Predict and export image to directory
Addax.predict <- predict(satImage, glm.Addax2, type="response", progress='text')
# Plot result
plot(Addax.predict)

# Or write the raster to the directory separately
#writeRaster(Addax.predict, 'Test.tif', format="GTiff", datatype = "INT1U", overwrite=TRUE)

Plot Presence Threshold

What if I only wanted to see values above a certain threshold? Using simple raster math, we could visualize only the areas that meet our criteria.

# Create presence threshold
presenceThresh <- 0.13

# Threshold the result and plot
Addax.Thresh <- Addax.predict >= presenceThresh
plot(Addax.Thresh)

Interactive Mapping

Lastly, we can take this one step further and plot our predicted surface on an interactive map, giving us a better idea of its real-world context and an easy way to share our results.

# Create color ramp
pal <- colorNumeric(rev(terrain.colors(10)), values(Addax.predict), na.color = "transparent")

# Reproject raster predict to Lat/Long
proj.info <- crs(Addax.predict)
proj.info.LongLat <- CRS("+proj=longlat +datum=WGS84")
Addax.predict.lat <- projectRaster(Addax.predict, crs=proj.info.LongLat) # Use default resolution

leaflet() %>%
  addTiles() %>%
  addRasterImage(Addax.predict.lat, colors = pal, opacity = 0.9) %>%
  addLegend(pal = pal, values = values(Addax.predict),
    title = "Predicted Addax Occurrence")